Abstract

The complexity of a computational problem is traditionally quantified based on the hardness 
of its worst case. This approach has many advantages and has led to a deep and beautiful theory. 
However, from the practical perspective, this leaves much to be desired. In application areas, 
practically interesting instances very often occupy just a tiny part of an algorithm's space of 
instances, and the vast majority of instances are simply irrelevant. Addressing these issues is a 
major challenge for theoretical computer science which may make theory more relevant to the 
practice of computer science. 

Following [BL], we apply this perspective to MAXCUT, viewed as a clustering problem. 
Using a variety of techniques, we investigate practically interesting instances of this problem. 
Specifically, we show how to solve in polynomial time distinguished, metric, expanding and dense 
instances of MAXCUT under mild stability assumptions. In particular, $(1+\varepsilon)$-stability (which
is optimal) suffices for metric and dense MAXCUT. We also show how to solve in polynomial
time $\Omega(\sqrt{n})$-stable instances of MAXCUT, substantially improving the best previously known
result.
1 Introduction



The primary criterion used in computational complexity to evaluate algorithms is worst case
behavior, so that a problem is infeasible if no efficient algorithm can solve all its instances. In practice,
this approach is often overly pessimistic, and a more realistic (but fuzzy) criterion would be to say
that a problem is feasible if there is an efficient algorithm that correctly solves all of its
practically interesting instances. The difference can be very substantial, since for many computational
problems, the vast majority of instances are completely irrelevant for practical purposes.

An important case in point is clustering, where one seeks a meaningful partition of a given set
of data. Almost every formal manifestation of the clustering problem is NP-hard, yet a clustering
instance is of practical interest only if the data can indeed be partitioned in a meaningful way.
Random instances are not likely to have a meaningful partition, so data sets with a meaningful
partition are very special. Thus, even if no efficient algorithm can find the optimal partition for
every data set, this does not imply that clustering is hard in practice. As Tali Tishby put it in
conversation many years ago, many practitioners hold the opinion that "clustering is either easy or
pointless". That is, for data sets that admit a meaningful partition, finding it is not
hard.

Can this intuition be put on a solid theoretical foundation? Bilu and Linial [BL] proposed a
framework for studying this issue. Generally speaking, their approach pertains to optimization
problems with a continuous input space and discrete solution space. They proposed two criteria
for an optimal solution to be evidently optimal. A solution is stable if it remains optimal under
moderate perturbations of the input. A solution is distinguished if a transition to another solution
reduces the value of the objective function in proportion to the distance between the two solutions.
Concretely, they considered the case where the input is a weighted graph and the candidate solutions
are cuts (or more generally, partitions). Here, a cut is $\gamma$-stable (for $\gamma \ge 1$) if it remains optimal
even if each input weight $W_{ij}$ is perturbed to a value between $W_{ij}$ and $\gamma W_{ij}$. A cut is $\alpha$-distinguished
(for $\alpha \ge 0$) if moving to any other cut reduces the objective function by at least $\alpha$ times the sum
of (weighted) degrees of the vertices that switched side.

Following Bilu and Linial [BL], we apply these notions to the study of the (weighted) MAXCUT
problem. We also investigate the more restricted problem of Metric-MAXCUT^1, which arises often
in the field of machine learning. Our main results are:

Theorem 1.1 1. For every $\varepsilon > 0$ there is a polynomial time algorithm that correctly solves all
$(1+\varepsilon)$-locally stable instances of Metric-MAXCUT.

2. For every $\varepsilon > 0$ and $C \ge 1$ there is a polynomial time algorithm that correctly solves all
$(1+\varepsilon)$-locally stable and $C$-dense instances of MAXCUT.

The condition of $C$-density rules out overly-weighted edges. The notion of $\gamma$-local stability is
a substantial weakening of $\gamma$-stability. It is defined similarly, but we only require resilience to
perturbations that modify edges which are all incident with the same vertex.

Theorem 1.2 There is a polynomial time algorithm that solves all instances of MAXCUT that are

1. $\alpha$-distinguished and $\gamma$-locally stable with $\gamma > \frac{2}{1-\sqrt{1-\alpha^2}}$, or

2. $\gamma$-locally stable with $\gamma > \frac{2}{1-\sqrt{1-h^2}}$.

Here $h$ is the Cheeger constant of the maximal cut.

^1 That is, MAXCUT, restricted to instances where the weight function is a metric.

This substantially improves a result from [BL] that works only for regular graphs and requires that
… or $\gamma > …$. It is also shown in [BL] that $n$-stable instances are feasible. Here we
derive the same conclusion under the weaker (but still impractical) assumption of $\Omega(\sqrt{n})$-stability.



Theorem 1.3 There is a polynomial time algorithm that finds the optimal solution for every
$\Omega(\sqrt{n})$-stable instance of MAXCUT.

Some notation and terminology 

Here the input to the MAXCUT problem is the complete graph on $n$ vertices $G = (V,E)$ along with
a symmetric function with zero diagonal $w : V \times V \to \mathbb{R}^+$. Expressions such as "$w$ is bipartite"
refer to the graph which is the support of $w$, which is always assumed to be connected. Our purpose
is to find a cut $(S,\bar{S})$, $S \subset V$, for which $\sum_{a \in S,\, b \in \bar{S}} w(a,b)$ is maximized.

Fix a cut $(S,\bar{S})$. We use the self-explanatory terms "the vertices $x,y$ are on the same side" or
"separated" by this cut. We call the edge $xy$ a cut edge or a non-cut edge when $x, y$ are separated
resp. on the same side of the cut. For $A,B \subset V$, we denote $E(A,B) = \{ab \mid a \in A,\ b \in B\}$ and
$w(A,B) := \sum_{uv \in E(A,B)} w(u,v)$. Also $\tau_w(A) = \tau(A) = w(A,\bar{A})$ and $\mu_w(A) = \mu(A) = w(A,V)$.

Let $A \subset V$. We denote by $\xi(A)$ the weight of the cut edges emanating from $A$, i.e.,
$\xi(A) = \sum_{uv \in E(A,\bar{A}) \cap E(S,\bar{S})} w(u,v)$, and by $\iota(A) = \tau(A) - \xi(A)$ the weight of the non-cut edges. We slightly
abuse notation for singletons $A = \{v\}$ and pairs $A = \{u,v\}$ and write $\tau(v)$ or $\tau(e)$ etc., where
$e = uv$. The minimal, maximal and average degree of $w$ are denoted by $\delta(w) = \min_{v \in V} \tau(v)$,
$\Delta(w) = \max_{v \in V} \tau(v)$ and $\bar{\delta}(w) = \frac{1}{|V|}\sum_{v \in V} \tau(v)$ respectively. (The potentially confused reader may find
the following Greek-mathematical dictionary useful: $\tau$ stands for "total", $\xi$ for "external" and $\iota$ for
"internal").

1.1 Stable instances 

Definition 1.4 Let $w : V \times V \to [0,\infty)$ be an instance of MAXCUT and let $\gamma \ge 1$. An instance
$w' : V \times V \to [0,\infty)$ is a $\gamma$-perturbation of $w$ if

$$\forall u,v \in V, \quad w(u,v) \le w'(u,v) \le \gamma \cdot w(u,v).$$

An instance $w$ is said to be $\gamma$-stable if there is a cut which forms a maximal cut for every
$\gamma$-perturbation $w'$ of $w$.

Definition 1.5 Let $\gamma \ge 1$. An instance $w : V \times V \to [0,\infty)$ for MAXCUT is $\gamma$-locally stable if
there is a maximal cut $(S,\bar{S})$ for which it is impossible to obtain a larger cut by switching the side
of some vertex $x$ and multiplying the edges in $E(x, V \setminus \{x\})$ by numbers between $1$ and $\gamma$.

The definitions of stability and local stability capture the intuition of an "evidently optimal"
solution. The following more concrete equivalent definitions are usually more convenient to use.

Observation 1 Let $w : V \times V \to \mathbb{R}^+$ be an instance of MAXCUT and let $\gamma \ge 1$.

• The instance $w$ is $\gamma$-stable iff there is a maximal cut for which $\xi(A) \ge \gamma \cdot \iota(A)$ for every
$A \subset V$.

• The instance $w$ is $\gamma$-locally stable iff there is a maximal cut for which $\xi(x) \ge \gamma \cdot \iota(x)$ for every
$x \in V$.

We say that a (not necessarily maximal) cut $(S,\bar{S})$ is $\gamma$-stable (resp. $\gamma$-locally stable) if the first
(resp. second) condition in Observation 1 holds.

As Observation 1 shows, every instance is 1-stable, and being $\gamma$-stable for some $\gamma > 1$ is
equivalent to having a unique maximal cut^2. Finally, an instance is bipartite iff it is $\gamma$-stable for
every $\gamma \ge 1$. Thus, $\gamma$-stability is seen to be a relaxation of being bipartite.

Stability and local stability are quite different. As mentioned, for $\gamma > 1$ every instance has at
most one $\gamma$-stable cut. On the other hand, there can be numerous $\gamma$-locally stable cuts: Consider
the instance where $w = 1$ on the edges of a perfect matching and $w = \varepsilon > 0$ elsewhere. As $\varepsilon \to 0$, the
local stability tends to $\infty$. Yet, this instance is not $\gamma$-stable for any $\gamma > 1$. It is easy to check
that this instance has exponentially many $\gamma$-locally stable maximal cuts. From the computational
perspective the two properties are very different as well. MAXCUT remains NP-hard even
under arbitrarily high local stability (see [BL]), whereas we show here how to efficiently solve
$\Omega(\sqrt{n})$-stable instances. Also, it is easy to decide whether a given cut is $\gamma$-locally stable, but we do
not know how to decide whether a given cut is $\gamma$-stable and we suspect that this problem is hard.
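The contrast between the two notions can be checked numerically on tiny instances. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, the weight function is assumed to be a symmetric list-of-lists matrix, and `stability` enumerates all subsets, so it is only usable for very small $n$ (consistent with the remark that no efficient test for $\gamma$-stability is known). Both functions return the largest $\gamma$ for which the corresponding condition of Observation 1 holds for the given cut.

```python
from itertools import combinations

def xi_iota(w, side, A):
    """Return (xi(A), iota(A)): weight of cut / non-cut edges leaving A."""
    A = set(A)
    xi = iota = 0.0
    for a in A:
        for b in range(len(w)):
            if b in A:
                continue
            if side[a] != side[b]:
                xi += w[a][b]
            else:
                iota += w[a][b]
    return xi, iota

def ratio(xi, iota):
    # Largest gamma with xi >= gamma * iota (infinite when iota == 0).
    return float("inf") if iota == 0 else xi / iota

def local_stability(w, side):
    """Largest gamma such that xi(x) >= gamma * iota(x) for every vertex x."""
    return min(ratio(*xi_iota(w, side, [x])) for x in range(len(w)))

def stability(w, side):
    """Largest gamma such that xi(A) >= gamma * iota(A) for every proper subset A.

    Brute force over all subsets: exponential time, for illustration only.
    """
    n = len(w)
    return min(ratio(*xi_iota(w, side, A))
               for k in range(1, n) for A in combinations(range(n), k))
```

On the matching example above with four vertices, matching edges 0-1 and 2-3, and $\varepsilon = 0.01$, the cut that separates both matched pairs is roughly 101-locally stable, while its stability is only 1.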

In Section 4, following a simplified version of the algorithm in [BL] for $\Omega(n)$-stable instances, we
present a deterministic algorithm that solves every $\Omega(\sqrt{n})$-stable instance, proving Theorem 1.3.

1.2 Distinguished and Expanding instances 

Let $w : V \times V \to \mathbb{R}^+$ be an instance of MAXCUT whose (unique) maximal cut is $(S,\bar{S})$. We note
that if all vertices of $A \subset V$ switch side, then the weight of the cut decreases by $\xi(A) - \iota(A)$. Thus,
we define

Definition 1.6 An instance $w$ of MAXCUT is $\alpha$-distinguished for $\alpha \ge 0$ if for every $\emptyset \ne A \subsetneq V$,
$$\xi(A) - \iota(A) \ge \alpha \cdot \min\{\mu(A), \mu(\bar{A})\}.$$

Note that every instance is 0-distinguished and being $\alpha$-distinguished with $\alpha > 0$ is equivalent
to having a unique maximal cut. It is not hard to see that $\frac{1+\alpha}{1-\alpha}$-local stability is equivalent to $\alpha$-local
distinction, namely $\xi(x) - \iota(x) \ge \alpha \cdot \mu(x)$ for every $x \in V$.

Distinction vs Stability. Let $(S,\bar{S})$ be a maximal cut of $w : V \times V \to [0,\infty)$. On the one hand,
every $\alpha$-distinguished instance is $\frac{1+\alpha}{1-\alpha}$-stable, because $\xi(A) - \iota(A) \ge \alpha\mu(A) \ge \alpha(\xi(A) + \iota(A))$. On
the other hand, highly stable instances need not be distinguished, as the following bipartite example
with $V = \{a_1, \ldots, a_n\} \cup \{b_1, \ldots, b_n\}$ shows. Here $w(a_i, b_j)$ is $1$ when $i = j$ and $\varepsilon \ll 1$ otherwise.
Clearly $w$ is $\infty$-stable. Yet, switching the sides of all the vertices in $\{a_1, \ldots, a_{n/2}\} \cup \{b_1, \ldots, b_{n/2}\}$
decreases the weight of the cut only slightly. Such examples motivate the stronger notion of
distinction. Although the cut $(\{a_1, \ldots, a_n\}, \{b_1, \ldots, b_n\})$ is infinitely stable, its optimality does
not seem completely evident.
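The gap between stability and distinction in this example can be made concrete with a brute-force computation. The sketch below is illustrative only: the function name is hypothetical, the instance is assumed to be a symmetric matrix with connected support, and $\mu(A)$ is taken to be the sum of the weighted degrees of the vertices of $A$, as in the informal description of distinction in the introduction. It returns the largest $\alpha$ for which the inequality of Definition 1.6 holds for a given cut.

```python
from itertools import combinations

def distinction(w, side):
    """Largest alpha with xi(A) - iota(A) >= alpha * min(mu(A), mu(complement))
    for every nonempty proper subset A (exponential time, tiny instances only)."""
    n = len(w)
    deg = [sum(row) for row in w]          # weighted degrees
    total = sum(deg)
    best = float("inf")
    for k in range(1, n):
        for A in combinations(range(n), k):
            in_A = set(A)
            gain = 0.0                      # xi(A) - iota(A)
            for a in A:
                for b in range(n):
                    if b not in in_A:
                        gain += w[a][b] if side[a] != side[b] else -w[a][b]
            mu_A = sum(deg[a] for a in A)   # assumed meaning of mu(A)
            best = min(best, gain / min(mu_A, total - mu_A))
    return best
```

For $n = 2$ and $\varepsilon = 0.01$ (vertices $a_1, a_2, b_1, b_2$ indexed 0 to 3), the bipartite cut is infinitely stable, yet switching $\{a_1, b_1\}$ shows that it is only about 0.0099-distinguished.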

Distinction and Expansion. Call $w : V \times V \to \mathbb{R}^+$ $\beta$-expanding if $\beta \le h(w)$ where
$h(w) = \min_{\emptyset \ne A \subsetneq V} \frac{\tau(A)}{\min\{\mu(A), \mu(\bar{A})\}}$ is the Cheeger constant. An $\alpha$-distinguished instance is
$\alpha$-expanding, though highly expanding instances can even have multiple maximal cuts. However,
an instance that is both $\gamma$-stable and $\beta$-expanding is easily seen to be $(\beta \cdot \frac{\gamma-1}{\gamma+1})$-distinguished. As
this discussion implies, distinction is a conjunction of stability and expansion.

In Section 3 we prove Theorem 1.2, using a spectral result from [BL]. In the appendix we
re-derive this result and point out its close relation to the Goemans-Williamson algorithm [GW]
and other spectral techniques.

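^2 To see that, note that if $(S,\bar{S})$ is a $\gamma$-stable cut and $(T,\bar{T})$ is another cut then
$w(T,\bar{T}) = w(S,\bar{S}) - \xi((T \cap S) \cup (\bar{T} \cap \bar{S})) + \iota((T \cap S) \cup (\bar{T} \cap \bar{S})) \le w(S,\bar{S}) - \frac{\gamma-1}{\gamma+1}\tau((T \cap S) \cup (\bar{T} \cap \bar{S})) < w(S,\bar{S})$.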
1.3 Metric and Dense instances



In Section 2 we study metric instances. This is done through a reduction from metric to dense
instances, so we consider such instances as well (Section 2.1).

We call $w : V \times V \to \mathbb{R}^+$ $C$-dense for $C \ge 1$ if $\forall x,y \in V$, $w(x,y) \le C \cdot \frac{\tau(x)}{|V|}$. As shown in [AKK],
for $C \ge 1$ fixed, $C$-dense MAXCUT is NP-hard, but it has a PTAS. As we show, this PTAS can
be adapted to correctly solve all instances of MAXCUT that are $(1+\varepsilon)$-locally stable and $C$-dense
for every $\varepsilon > 0$, $C \ge 1$. The algorithm samples $O(\log n)$ vertices and tests each of their bipartitions
as a seed to a cut. As we show, w.h.p., one of the resulting cuts is the maximal cut, proving the
second part of Theorem 1.1.

In Section 2.2 we deal with Metric-MAXCUT. As shown in [ ] (with credit to L. Trevisan)
Metric-MAXCUT is NP-hard. That paper also gives a reduction from metric to $(4 + o(1))$-dense
instances of MAXCUT, thus yielding a PTAS for Metric-MAXCUT. We show that a slight variation
of this reduction preserves local stability^3 and therefore yields an efficient algorithm for
$(1+\varepsilon)$-locally stable instances of Metric-MAXCUT, proving Theorem 1.1 in full.

This algorithm for metric instances is far from being a practically applicable clustering method.
Even though it is polynomial-time, the actual run times are prohibitively high. We view this more
as an invitation to seek practical algorithms for $\gamma$-stable instances of metric MAXCUT for some
reasonable values of $\gamma$. Specifically we provide such an algorithm for $(3+\varepsilon)$-locally stable metric
instances.

1.4 Relation with other work 

Smoothed analysis is the best known example of a method for analyzing instances of computational
problems based on their practical significance. As this method shows [ST], a certain variant of the
simplex algorithm solves in polynomial time almost every input. Even closer to our theme are
several recent papers on clustering. In [ABS] polynomial time algorithms are given for 3-stable
instances of $k$-means, $k$-medians and other "center based" clustering problems. The constant 3 was
improved in [BL2] to $(1 + \sqrt{2})$ for $k$-median. The papers [DLS, AB, BBV] consider data sets that
admit a good clustering and show how to cluster them efficiently.

Also related to our work are the planted partition model [B] and semirandom model [FK] for
MAXCUT. In these models instances are generated by splitting the vertices at random, $V = S \cup \bar{S}$.
Edges in $S \times \bar{S}$ (resp. $S \times S \cup \bar{S} \times \bar{S}$) are picked with probability $p$, resp. $q < p$. In the semirandom
model we also allow an adversary to add edges to $S \times \bar{S}$ and drop edges from $S \times S \cup \bar{S} \times \bar{S}$. As
shown in [B, FK], a.a.s., $(S,\bar{S})$ is the maximal cut and it can be efficiently found using certain
algorithms. It is not hard to see that for fixed $p$ and $q$, this is a consequence of Theorem 1.1. The
planted partition model is a random model that usually generates instances with a good partition,
and those can be efficiently found. The semirandom model goes further by allowing an adversary
to modify the input in a way that improves the optimal partition. Here we take an additional step
forward, since we solve efficiently every instance with a good partition.

^3 A word of caution: Our definition of stability and local stability for Metric-MAXCUT is more restrictive than
one might think. We require the perturbed instance to satisfy the stability condition whether or not it is metric.
2 Algorithms for locally stable dense and metric instances



2.1 Dense instances 

Theorem 2.1 For every $C \ge 1$ and $\varepsilon > 0$ there is a randomized polynomial time algorithm that
correctly solves all $(1+\varepsilon)$-locally stable, $C$-dense instances of MAXCUT.

The analysis of the algorithm is based on the following lemma. 

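Lemma 2.2 Suppose that $w : V \times V \to [0,\infty)$ is a $C$-dense instance and let $(S,\bar{S})$ be a $\gamma$-locally
stable cut. Let $X_1, \ldots, X_m$ be i.i.d. r.v. that are uniformly distributed on $V$. For $x \in V$, let $A_x$ be
the event that $S_+ > S_-$, where $S_\pm = \sum w(x, X_i)$ over all $i$ s.t. $x$ and $X_i$ are separated resp. on the
same side. Then

$$\Pr\left(\exists x \in V : \neg A_x\right) \le |V| \cdot \exp\left(-\frac{1}{2}\left(\frac{\gamma-1}{C(\gamma+1)}\right)^2 m\right)$$

Proof The lemma follows from Hoeffding's bound. For every $x \in V$, $S_+ - S_-$ is a sum of $m$ i.i.d.
r.v.'s of expectation $\ge \frac{\gamma-1}{\gamma+1} \cdot \frac{\tau(x)}{|V|}$. These r.v.'s are bounded in absolute value by $C \cdot \frac{\tau(x)}{|V|}$.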

□ 

Proof (Of Theorem 2.1) Let $D = 2\left(C \cdot \frac{2+\varepsilon}{\varepsilon}\right)^2$. Let $m = D \cdot \ln(2|V|)$. Take an i.i.d. sample of $m$
uniformly chosen points $X_1, \ldots, X_m \in V$. By the above lemma, with probability $\ge 0.5$, there is a
partition $\{X_1, \ldots, X_m\} = L \sqcup R$ such that the cut defined by $S = \{x \in V : w(x, R) > w(x, L)\}$ is
the optimal cut. Since the number of such partitions is $2^m = (2|V|)^{\ln(2) \cdot D}$, there are only polynomially
many partitions to consider, yielding an efficient randomized algorithm for the problem.

□ 
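The enumeration step in the proof of Theorem 2.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the instance is a symmetric list-of-lists matrix, and the sample is passed in explicitly so that the example is deterministic (in the algorithm it would consist of $m = D \cdot \ln(2|V|)$ uniformly random vertices). Every bipartition $(L, R)$ of the sample seeds the cut $S = \{x : w(x,R) > w(x,L)\}$, and the heaviest seeded cut is returned.

```python
from itertools import product

def cut_weight(w, side):
    """Total weight of edges whose endpoints get different labels in `side`."""
    n = len(w)
    return sum(w[i][j] for i in range(n) for j in range(i + 1, n)
               if side[i] != side[j])

def best_cut_from_sample(w, sample):
    """Try every bipartition (L, R) of the sample as a seed; keep the best cut.

    Runs in time 2^len(sample) * O(n^2), so the sample must be logarithmic in n.
    """
    n = len(w)
    best_side, best_val = None, float("-inf")
    for labels in product((0, 1), repeat=len(sample)):
        left = [x for x, lab in zip(sample, labels) if lab == 0]
        right = [x for x, lab in zip(sample, labels) if lab == 1]
        # x joins S exactly when it is attached more heavily to R than to L.
        side = [sum(w[x][r] for r in right) > sum(w[x][l] for l in left)
                for x in range(n)]
        val = cut_weight(w, side)
        if val > best_val:
            best_side, best_val = side, val
    return best_side, best_val
```

With a sample that hits both sides of a locally stable cut, the seed that agrees with that cut reproduces it; the lemma bounds the probability that a random sample fails to do so.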

Corollary 2.3 For every $C \ge 1$ and $\varepsilon > 0$, a $C$-dense instance of MAXCUT has only $\mathrm{poly}(|V|)$-many
$(1+\varepsilon)$-locally stable cuts.

Proof Consider the random cut $(S,\bar{S})$, sampled as in the proof of Theorem 2.1, where the partition
$\{X_1, \ldots, X_m\} = L \sqcup R$ is chosen uniformly at random. The proof of Theorem 2.1 shows that for every
$(1+\varepsilon)$-locally stable cut $(T,\bar{T})$, the probability that $(S,\bar{S}) = (T,\bar{T})$ is $\ge 0.5 \cdot (2|V|)^{-\ln(2) \cdot D}$. Thus,
there are at most $2 \cdot (2|V|)^{\ln(2) \cdot D}$ such cuts.

□ 



2.2 Metric instances 

Given an instance $w : V \times V \to [0,\infty)$ of MAXCUT, we split its vertices as follows. Pick a set $\tilde{V}$
and a surjective map $\pi : \tilde{V} \to V$. A MAXCUT instance $\tilde{w}$ on $\tilde{V}$ is defined as follows:

$$\tilde{w}(\tilde{x},\tilde{y}) = \frac{w(x,y)}{|\pi^{-1}(x)| \cdot |\pi^{-1}(y)|}$$

where $\pi(\tilde{x}) = x$, $\pi(\tilde{y}) = y$. It is not hard to prove that

Proposition 2.4 Consider the following map from cuts of $w$ to cuts of $\tilde{w}$ defined by

$$(S,\bar{S}) \mapsto (\pi^{-1}(S), \pi^{-1}(\bar{S}))$$

Then 
1. This map preserves weights, stability and local stability of cuts.

2. Restricted to the locally stable cuts (i.e., $\gamma$-locally stable cuts with $\gamma > 1$), this is a bijection
onto the locally stable cuts of $\tilde{w}$.

3. It maps maximal cuts to maximal cuts.
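The vertex-splitting construction, and the weight-preservation claim of Proposition 2.4, can be illustrated with exact rational arithmetic. This is a sketch with hypothetical names: `copies[x]` plays the role of $|\pi^{-1}(x)|$, the returned list `pi` records the original vertex of each copy, and integer weights are assumed so that `Fraction` stays exact.

```python
from fractions import Fraction

def split_instance(w, copies):
    """Build the split instance: copies[x] = size of the preimage of x.

    Returns (w_tilde, pi) where pi[i] is the original vertex of copy i and
    w_tilde[i][j] = w(pi[i], pi[j]) / (copies[pi[i]] * copies[pi[j]]).
    Copies of the same vertex get weight 0 because w has zero diagonal.
    """
    pi = [x for x, c in enumerate(copies) for _ in range(c)]
    m = len(pi)
    w_tilde = [[Fraction(w[pi[i]][pi[j]], copies[pi[i]] * copies[pi[j]])
                for j in range(m)] for i in range(m)]
    return w_tilde, pi

def cut_weight(w, side):
    """Total weight of edges whose endpoints get different labels in `side`."""
    n = len(w)
    return sum(w[i][j] for i in range(n) for j in range(i + 1, n)
               if side[i] != side[j])
```

Lifting a cut of $w$ to $\tilde{w}$ means giving every copy the side of its original vertex, e.g. `[side[x] for x in pi]`; the total weight between the copies of $x$ and the copies of $y$ is again $w(x,y)$, so the cut weight is unchanged.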

As the following proposition shows, the above construction is a reduction from metric to
$(4 + o(1))$-dense instances.

Proposition 2.5 Let $w : V \times V \to [0,\infty)$ be an instance of Metric-MAXCUT with $w(V,V) = 2 \cdot |V|^2$.
Consider the map $\pi : \bigsqcup_{x \in V} [\lfloor \tau_w(x) \rfloor] \to V$. The instance $\tilde{w}$ obtained by $\pi$ is
$(4 + o(1))$-dense.

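Proof Let $\tilde{x},\tilde{y}$ be such that $\pi(\tilde{x}) = x$, $\pi(\tilde{y}) = y$. It is easy to see that (see [Vl\]) $2 \cdot |V| \cdot \tau_w(x) \ge w(V,V)$,
$\lfloor \tau_w(x) \rfloor \ge \left(1 - \frac{1}{|V|}\right)\tau_w(x)$, $\tau_{\tilde{w}}(\tilde{x}) = \frac{\tau_w(x)}{\lfloor \tau_w(x) \rfloor} \ge 1$ and $w(x,y) \le \frac{1}{|V|}(\tau_w(x) + \tau_w(y))$. Thus,
we have

$$\begin{aligned}
\tilde{w}(\tilde{x},\tilde{y}) &= \frac{w(x,y)}{\lfloor \tau_w(x) \rfloor \cdot \lfloor \tau_w(y) \rfloor} \\
&\le \frac{1}{(1-1/|V|)^2} \cdot \frac{w(x,y)}{\tau_w(x) \cdot \tau_w(y)} \\
&\le \frac{1}{(1-1/|V|)^2} \cdot \frac{\frac{1}{|V|}\left(\tau_w(x) + \tau_w(y)\right)}{\tau_w(x) \cdot \tau_w(y)} \\
&= \frac{1}{(1-1/|V|)^2} \cdot \left(\frac{1}{|V|\tau_w(x)} + \frac{1}{|V|\tau_w(y)}\right) \\
&\le \frac{1}{(1-1/|V|)^2} \cdot \frac{4}{w(V,V)} \\
&\le \frac{1}{(1-1/|V|)^2} \cdot \frac{4}{|\tilde{V}|} \\
&\le (4 + o(1)) \cdot \frac{\tau_{\tilde{w}}(\tilde{x})}{|\tilde{V}|}
\end{aligned}$$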

□ 



Corollary 2.6 Let $\varepsilon > 0$.

1. There is a randomized polynomial time algorithm for $(1+\varepsilon)$-locally stable instances of
Metric-MAXCUT.

2. The number of $(1+\varepsilon)$-locally stable cuts in a metric instance is polynomial in $|V|$.

2.2.1 A faster algorithm for $(3+\varepsilon)$-stable metric instances

Proposition 2.7 Let $(L,R)$ be a $\gamma$-locally stable cut of an instance, $w$, of Metric-MAXCUT. Then,
for every $x \in L$, $z \in R$, $w(x,z) \ge \left(\gamma - \frac{1}{\gamma}\right) \cdot \frac{w(x,R)}{\gamma|R| + |L|}$.

Proof Using $\gamma$-local stability and the triangle inequality we obtain

$$\begin{aligned}
\frac{1}{\gamma} w(x,R) \ge w(x,L) &= \sum_{y \in L} w(x,y) \\
&\ge \sum_{y \in L} \left(w(z,y) - w(x,z)\right) \\
&= w(z,L) - |L| w(x,z) \\
&\ge \gamma w(z,R) - |L| w(x,z) \\
&= \gamma \sum_{y \in R} w(z,y) - |L| w(x,z) \\
&\ge \gamma \sum_{y \in R} \left(w(y,x) - w(z,x)\right) - |L| w(x,z) \\
&= \gamma w(x,R) - \gamma|R| w(x,z) - |L| w(x,z)
\end{aligned}$$

Rearranging yields the claimed bound.

□

Theorem 2.8 Let $(X, w)$ be an instance of Metric-MAXCUT and let $(L, R)$ be a $\gamma = (3+\varepsilon)$-locally
stable cut with $\varepsilon > 0$. Then either $L$ or $R$ is a (metric) ball.

Proof W.l.o.g., $|L| \ge |R|$. We find some $x \in L$ such that $\forall z \in R$, $w(z,x) > \mathrm{diam}(L)$, thus
proving our claim. Select some $x,y \in L$ with $w(x,y) = \mathrm{diam}(L)$. For every $z \in L$, we write
$w(x,y) \le w(x,z) + w(y,z)$. Summing over every $z \in L$, this yields $|L| \cdot w(x,y) \le w(x,L) + w(y,L)$.
W.l.o.g., assume that $w(x,L) \ge \frac{|L|}{2} \cdot w(x,y)$. By local stability,

$$w(x,y) \le \frac{2}{|L|} w(x,L) \le \frac{2 \cdot w(x,R)}{\gamma \cdot |L|} \qquad (1)$$

By Proposition 2.7, every $z \in R$ satisfies $w(x,z) \ge \left(\gamma - \frac{1}{\gamma}\right) \cdot \frac{w(x,R)}{\gamma|R| + |L|}$. Combined with equation (1),
and the assumptions that $\gamma > 3$ and $|L| \ge |R|$, we obtain that $w(x,z) > w(x,y)$ as claimed.

□

By Theorem 2.8, the maximal cut of $(3+\varepsilon)$-locally stable instances of Metric-MAXCUT can be
found by simply considering all $O(n^2)$ balls.

Note 2.9 Theorem 2.8 is tight in the following sense. We show an example of a $(3-\varepsilon)$-stable metric
instance (not just locally stable), where neither side of its maximal cut is a ball, nor can it even be
expressed as the union of few balls.

Here is the example: It is a metric space $(X,w) = (L \sqcup R, w)$ where $L = \{l_1, \ldots, l_{2n}\}$, $R =
\{r_1, \ldots, r_{2n}\}$. Generally speaking, the distance between two points which are both in $L$ or in $R$ is
1. The distance between a point in $L$ and a point in $R$ is 3. The following are exceptions to the
general rule: $\forall 1 \le i \le n$, $w(l_{2i-1}, l_{2i}) = w(r_{2i-1}, r_{2i}) = 2$ and $\forall 1 \le i \le 2n$, $w(l_i, r_i) = 2$. It is not
hard to see that $w$ is a $(3 - o(1))$-stable metric instance and each side of its maximal cut cannot be
decomposed into fewer than $2n$ balls.
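The ball-enumeration procedure suggested by Theorem 2.8 can be sketched as follows. The function names are hypothetical and the code is only an illustration: it assumes the instance is given as a symmetric distance matrix, tries every closed ball $\{y : w(x,y) \le r\}$ with center $x$ and radius $r$ equal to one of the distances from $x$, and returns the heaviest cut of the form (ball, complement). By the theorem this finds the maximal cut whenever that cut is $(3+\varepsilon)$-locally stable; on other inputs it is merely a heuristic.

```python
def ball_cut_weight(w, ball):
    """Weight of the cut separating the vertex set `ball` from its complement."""
    n = len(w)
    return sum(w[i][j] for i in ball for j in range(n) if j not in ball)

def best_ball_cut(w):
    """Enumerate the O(n^2) balls and return (ball, weight) of the heaviest cut."""
    n = len(w)
    best_ball, best_val = None, float("-inf")
    for x in range(n):
        for r in sorted(set(w[x])):        # candidate radii: distances from x
            ball = frozenset(y for y in range(n) if w[x][y] <= r)
            if len(ball) == n:             # the whole space is not a cut
                continue
            val = ball_cut_weight(w, ball)
            if val > best_val:
                best_ball, best_val = ball, val
    return best_ball, best_val
```

For example, for the four points 0, 1, 10, 11 on the real line with $w$ the absolute difference, the ball of radius 1 around the first point yields the cut $\{0, 1\} \mid \{10, 11\}$ of weight 40.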
3 Distinguished and Expanding Instances



Let $w : V \times V \to [0,\infty)$ be an instance of MAXCUT with a maximal cut $(S,\bar{S})$. We identify $w$
with an $n \times n$ matrix $W$, where $W_{ij} = w(i,j)$. Define $w_{cut} : V \times V \to \mathbb{R}$ by $w_{cut}(u,v) = w(u,v)$ for
$uv \in E(S,\bar{S})$ and $w_{cut}(u,v) = 0$ otherwise. Similarly, denote $w_{uncut} = w - w_{cut}$. Denote by $W^{cut}$ and
$W^{uncut}$ the matrices corresponding to $w_{cut}$ and $w_{uncut}$ respectively. Finally, let $D^{cut}$, $D^{uncut}$, $D$ and
$D'$ be the diagonal matrices defined by $D^{cut}_{ii} = \sum_j W^{cut}_{ij}$, $D^{uncut}_{ii} = \sum_j W^{uncut}_{ij}$, $D = D^{cut} + D^{uncut}$
and $D' = D^{cut} - D^{uncut}$.



Lemma 3.1 If $w$ is $\gamma$-locally stable where $\gamma > \frac{2}{1-\sqrt{1-h(w_{cut})^2}}$, then $W + D'$ is a PSD matrix of
rank $n - 1$.

As shown in [BL] there is an efficient algorithm that correctly solves all instances that satisfy
the conclusion of the Lemma (as pointed out in the Appendix, the GW-algorithm solves all such
instances). This proves the second part of Theorem 1.2.

Proof First, we note that it is enough to prove that $D^{-1/2}(W + D')D^{-1/2}$ is a PSD matrix of rank
$n - 1$. Let $f : V \to \mathbb{R}$ be the vector defined by $f_i = \sqrt{D_{ii}}$ for $i \in S$ and $f_i = -\sqrt{D_{ii}}$ for $i \in \bar{S}$.
Since $f^T D^{-1/2}(W + D')D^{-1/2} f = 0$, it is enough to show that $v^T D^{-1/2}(W + D')D^{-1/2} v > 0$ for every
unit vector $v$ that is orthogonal to $f$. Note that

$$D^{-1/2}(W + D')D^{-1/2} = D^{-1/2}\left(D^{cut} + W^{cut} - D^{uncut} + W^{uncut}\right)D^{-1/2} \qquad (2)$$

The matrix $D^{-1/2}(W^{cut} + D^{cut})D^{-1/2}$ is positive semi-definite and $f$ is in its kernel (to see that, note
that for $u \in \mathbb{R}^n$, $u^T(W^{cut} + D^{cut})u = \sum_{i<j} W^{cut}_{ij}(u_i + u_j)^2$). Therefore we have

$$v^T D^{-1/2}(W^{cut} + D^{cut})D^{-1/2} v \ge \lambda_2 \qquad (3)$$

where $0 = \lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$ are the eigenvalues of $D^{-1/2}(W^{cut} + D^{cut})D^{-1/2}$. Moreover,
$W^{uncut} + D^{uncut} \succeq 0 \Rightarrow 2D^{uncut} \succeq D^{uncut} - W^{uncut}$. Here $A \succeq B$ means that the matrix $A - B$ is PSD.
Thus, we have,

$$v^T D^{-1/2}(D^{uncut} - W^{uncut})D^{-1/2} v \le 2 \cdot v^T D^{-1/2} D^{uncut} D^{-1/2} v \le 2 \cdot \max_i \frac{D^{uncut}_{ii}}{D_{ii}} \le \frac{2}{\gamma + 1} \qquad (4)$$

Combining equations (2), (3) and (4), it is enough to show that $\lambda_2 > \frac{2}{\gamma+1}$. However, since $w_{cut}$ is
bipartite, the matrices $D^{-1/2}(D^{cut} + W^{cut})D^{-1/2}$ and $D^{-1/2}(D^{cut} - W^{cut})D^{-1/2}$ have the same spectrum^4.
Also, $D^{-1/2}(D^{cut} - W^{cut})D^{-1/2}$ and $D^{-1}(D^{cut} - W^{cut})$ have the same spectrum^5, so it suffices to show
that $\mu_2 > \frac{2}{\gamma+1}$, where $\mu_2$ is the second smallest eigenvalue of $D^{-1}(D^{cut} - W^{cut})$. By the known
relation between expansion and the second eigenvalue of the Laplacian (e.g., Theorem 2.2 in [FX]),
it follows that

$$\mu_2 \ge \min_i \frac{D^{cut}_{ii}}{D_{ii}} \cdot \left(1 - \sqrt{1 - h(w_{cut})^2}\right) \ge \frac{\gamma}{\gamma+1}\left(1 - \sqrt{1 - h(w_{cut})^2}\right),$$

which exceeds $\frac{2}{\gamma+1}$ by the assumption on $\gamma$.



□ 



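Finally, to prove the first part of Theorem 1.2, it is enough to show that if $w$ is $\alpha$-distinguished
then $h(w_{cut}) \ge \alpha$. Indeed, for $\emptyset \ne A \subsetneq V$ we have

$$\tau_{w_{cut}}(A) = \xi_w(A) \ge \xi_w(A) - \iota_w(A) \ge \alpha \cdot \min\{\mu_w(A), \mu_w(\bar{A})\} \ge \alpha \cdot \min\{\mu_{w_{cut}}(A), \mu_{w_{cut}}(\bar{A})\}$$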



^4 To see that, let $P : \mathbb{R}^n \to \mathbb{R}^n$ be the operator that multiplies by $-1$ the coordinates corresponding to one side of
the cut and fixes the other. The operator $P$ commutes with diagonal matrices and satisfies $W^{cut}P = -PW^{cut}$. Thus, $v$
is an eigenvector of $D^{-1/2}(D^{cut} + W^{cut})D^{-1/2}$ with an eigenvalue $\lambda$ iff $Pv$ is an eigenvector of $D^{-1/2}(D^{cut} - W^{cut})D^{-1/2}$
with an eigenvalue $\lambda$.

^5 Since $v$ is an eigenvector of $D^{-1/2}(D^{cut} - W^{cut})D^{-1/2}$ with eigenvalue $\lambda$ iff $D^{-1/2}v$ is an eigenvector of
$D^{-1}(D^{cut} - W^{cut})$ with eigenvalue $\lambda$.
4 Algorithms for stable instances



We begin with a useful observation. 

Observation 2 Let $w$ be a $\gamma$-stable instance of MAXCUT, and let $w'$ be obtained from $w$ by
merging two vertices^6 on the same side of $w$'s maximal cut. Then $w'$ is $\gamma$-stable and its maximal
cut is induced from $w$'s maximal cut.

By the above observation, in order to design an efficient algorithm for $\gamma$-stable
instances, it is enough to show that in every $\gamma$-stable instance we can efficiently find a pair of vertices
that are on the same side of the cut. Once two such vertices are found, we merge them and proceed
recursively. This applies as well when $\gamma$ is not a constant, but a non-decreasing function of $n$.
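The merge step used in this recursion can be sketched as follows. The function name and the matrix representation are assumptions made for illustration: the two merged vertices are replaced by a single new vertex placed last, whose weight to every remaining vertex $x$ is $w(u,x) + w(v,x)$, and the weight $w(u,v)$ itself is dropped since the merged vertices end up on the same side.

```python
def merge_vertices(w, u, v):
    """Merge vertices u and v of the symmetric weight matrix w.

    Returns (w_merged, keep): `keep` lists the surviving original vertices in
    order; the merged vertex is the last row/column of w_merged.
    """
    n = len(w)
    keep = [x for x in range(n) if x not in (u, v)]
    merged = [[w[a][b] for b in keep] + [w[a][u] + w[a][v]] for a in keep]
    merged.append([w[a][u] + w[a][v] for a in keep] + [0])
    return merged, keep
```

Repeating this until two vertices remain, and remembering which original vertices each merged vertex stands for, recovers the two sides of the cut.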

As an easy warm-up, we show how this observation yields a simple efficient algorithm that solves
every $2n$-stable instance $w : V \times V \to \mathbb{R}$ of MAXCUT. This is a simplification of an algorithm
from [BL]. By Observation 2, it suffices to find two vertices which are on the same side of the
maximal cut. Pick an arbitrary vertex $v \in V$. If $vu$ is the heaviest edge incident with $v$, then
clearly $w(v,u) \ge \frac{1}{n-1}\tau(v)$. On the other hand, by Observation 1, $\iota(v) \le \frac{1}{2n+1}\tau(v)$, so $w(v,u) > \iota(v)$
and we conclude that $vu$ is a cut edge. Now, let $e$ be the heaviest edge incident with $\{u,v\}$, say
$e = vz$. Again, $w(v,z) \ge \frac{1}{2(n-2)}\tau(\{v,u\})$ and by Observation 1, $\iota(\{v,u\}) \le \frac{1}{2n+1}\tau(\{v,u\})$, implying
that $w(v,z) > \iota(\{v,u\})$. Consequently $vz$ is a cut edge. But since $vz$ and $vu$ are cut edges, the
vertices $z$ and $u$ are on the same side of the cut.
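The warm-up argument translates directly into a short routine. The sketch below uses a hypothetical function name and assumes a symmetric weight matrix with at least three vertices; its output is only guaranteed to be a same-side pair when the instance is $2n$-stable, as in the argument above. It handles both cases for the second edge: the heaviest edge leaving $\{u,v\}$ may be incident with $v$ or with $u$, and the returned pair consists of its outer endpoint together with the endpoint of $vu$ it does not touch.

```python
def same_side_pair(w, v=0):
    """Return two distinct vertices on the same side of the maximal cut
    (valid for 2n-stable instances with at least 3 vertices)."""
    n = len(w)
    # Heaviest edge vu at v: a cut edge by the warm-up argument.
    u = max((x for x in range(n) if x != v), key=lambda x: w[v][x])
    outside = [x for x in range(n) if x not in (u, v)]
    # Heaviest edge az leaving the pair {u, v}: also a cut edge.
    a, z = max(((a, x) for a in (v, u) for x in outside),
               key=lambda e: w[e[0]][e[1]])
    other = u if a == v else v
    # a is separated from both z and `other`, so those two share a side.
    return z, other
```

Combined with a merge step, calling this routine repeatedly contracts each side of the cut to a single vertex.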



4.1 A deterministic algorithm for $\Omega(\sqrt{n})$-stable instances

Following Observation 2, the algorithm we present will find two vertices which are on the same side
of the cut. Let $w : V \times V \to \mathbb{R}$ be a $\gamma$-stable instance of MAXCUT with $\gamma \ge \sqrt{8n+4} + 1$ and let
$(S,\bar{S})$ be a maximal cut. We first deal with very heavy edges. Define

$$T^1 := \left\{vu : w(v,u) > \frac{1}{\gamma+1}\tau(v)\right\}$$

By Observation 1, all edges in $T^1$ are cut edges. Thus if there are two incident edges $uv, vz \in T^1$,
then $u$ and $z$ are on the same side of the cut and we are done. It remains to consider the case
where $T^1$ is a matching. Define

$$T^2 = \left\{uv \notin T^1 : w(u,v) > \frac{1}{\gamma+1}\tau(\{u,z\}) \text{ for some } uz \in T^1\right\}$$

Again, by Observation 1, all edges in $T^2$ are cut edges. If $T^2$ is nonempty, say $uv \in T^2$, then there
exists some $uz \in T^1$ with $w(u,v) > \frac{1}{\gamma+1}\tau(\{u,z\})$, which implies that $v$ and $z$ are on the same side
of the cut. We proceed to consider the case where $T^2$ is empty.
For every $u,v \in V$ define

$$\tilde{w}(u,v) = \begin{cases} 0 & vu \in T^1 \\ w(u,v) & \text{o/w} \end{cases}, \qquad \tilde{w}(v) = \begin{cases} \tau(\{u,v\}) & vu \in T^1 \text{ for some } u \in V \\ \tau(v) & \text{o/w} \end{cases}$$

Note that $\tilde{w}(v)$ is well defined, since $T^1$ is a matching by assumption. Since $T^2 = \emptyset$ and $T^1$
is a matching, we have, for every $u \in V$, $\tilde{w}(v,u) \le \frac{1}{\gamma+1}\tilde{w}(v)$ and, again by Observation 1, $\iota(v) \le \frac{1}{\gamma+1}\tilde{w}(v)$.


^6 Let $w : V \times V \to \mathbb{R}$ be an instance and let $v,u \in V$. The instance $w' : V' \times V' \to \mathbb{R}$ obtained upon merging $v, u$ is
defined as follows. $V' = V \setminus \{u,v\} \cup \{v'\}$ and $w'(x,y) = w(x,y)$ for $x,y \in V \setminus \{v,u\}$, also, $w'(v',x) = w(v,x) + w(u,x)$.
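Next, we observe as well that separated vertices cannot have too many common neighbors.
For $u,v \in V$ we define $n(u,v) := \sum_{z \in V} \tilde{w}(v,z)\tilde{w}(z,u)$. If $v$ and $u$ are separated, say $v \in S$, $u \in \bar{S}$,
then

$$\begin{aligned}
n(u,v) &= \sum_{z \in S} \tilde{w}(v,z)\tilde{w}(z,u) + \sum_{z \in \bar{S}} \tilde{w}(v,z)\tilde{w}(z,u) \\
&\le \frac{1}{\gamma+1}\tilde{w}(v) \cdot \iota(u) + \frac{1}{\gamma+1}\tilde{w}(u) \cdot \iota(v)
\end{aligned}$$

Since $\iota(x) \le \frac{1}{\gamma+1}\tilde{w}(x)$ for every $x \in V$, the right-hand side is at most $\frac{2}{(\gamma+1)^2}\tilde{w}(u) \cdot \tilde{w}(v)$.
Thus, it suffices to find two vertices $v,u$ with $n(u,v) > \frac{2}{(\gamma+1)^2}\tilde{w}(u) \cdot \tilde{w}(v)$, and place them on the
same side of the cut. Indeed, if no such pair exists we have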



$$\cdots = \sum_{u,v,z \in V} \tilde{w}(u,z)\tilde{w}(z,v) = \sum_{u,v \in V,\, u \ne v} n(u,v) + \sum_{u \in V}\sum_{z \in V} \tilde{w}^2(u,z)$$

And it follows that $\gamma < \sqrt{8n+4} + 1$. A contradiction.



5 Conclusion and open problems 

Our results together with work from [AB, ABS, BL, DLS, BL2] show that in many cases practically
interesting instances of hard problems are computationally feasible. Still much remains to be
done toward a new paradigm of analyzing the complexity of computational problems of practical
significance. Even if we restrict our attention to MAXCUT, many problems remain open. Here are
some of the more significant challenges:

• Following [BL], we recall the (admittedly bold) conjecture that there is a constant $\gamma^* > 1$,
s.t. $\gamma^*$-stable instances can be solved in polynomial time.

• It is interesting to seek the best possible dependency of $\gamma$ on $\alpha$ in Theorem 1.2. We are quite
certain that further improvements are possible.

• With reference to Corollary 2.6, can you find a practically efficient algorithm for, say, 2-locally
stable metric instances?
A The Spectral approach and the GW algorithm

Convex programming relaxations play a key role in the study of hard computational problems. They
mostly play a prominent role in the search for approximate solutions. The Goemans-Williamson
(GW) approximate solution for MAXCUT is a prime example of this approach. Can such algorithms
provide as well exact solutions for practically interesting instances? Many papers (e.g. [B, DP, GW,
M]) study the relationships between the maximal cut and spectrum of matrices associated with the
instance. Such ideas have led to various heuristics and approximation algorithms for MAXCUT.
In Section A we ask under which conditions those methods solve MAXCUT exactly. As shown in
Section 3, distinguished instances satisfy such conditions.

We need some terminology. We identify an instance $w$ of MAXCUT with an $n \times n$ matrix $W$,
where $W_{ij} = w(i,j)$. A vector $v \in \mathbb{R}^n$ is called a generalized least eigenvector (GLEV) of $W$ if there
is a diagonal matrix $D$ such that $v$ is an eigenvector of $W + D$ corresponding to $(W + D)$'s least
eigenvalue, $\lambda$. By letting $\Delta := D - \lambda I$ we see that $v$ is a GLEV iff $v$ is in the kernel of $W + \Delta$ for
$\Delta$ diagonal with $W + \Delta \succeq 0$. (As usual $A \succeq 0$ means that $A$ is positive semi-definite). A vector
$v \in \mathbb{R}^n$ induces the cut $(S,\bar{S})$ where $S = \{i : v_i > 0\}$. An algorithm for MAXCUT is called spectral
if it always returns a cut that is induced by a GLEV.

Many popular approximation algorithms and heuristics for MAXCUT are spectral. They usually
work by returning the cut induced by $W$'s lowest eigenvector (LEV) or by LEV's of related matrices.
As we note below, the GW-algorithm is also spectral. Here is the underlying logic of this approach.
The characteristic vector of the cut $(S,\bar{S})$ is defined as $\delta_S = \chi_S - \chi_{\bar{S}}$ where $\chi_A : V \to \{0,1\}$ is
the indicator function of $A$. If $D$ is a diagonal matrix, then $\delta_S^T(W + D)\delta_S = 2w(V) + \sum_{i=1}^n (D_{ii} - W_{ii}) - 4w(S,\bar{S})$.
Thus, the MAXCUT problem can be formulated as follows

$$\begin{aligned} &\text{minimize} && v^T(W + D)v \\ &\text{subject to} && v \in \{1,-1\}^n \end{aligned} \qquad (5)$$

A natural relaxation of this problem is

$$\begin{aligned} &\text{minimize} && v^T(W + D)v \\ &\text{subject to} && \|v\| = 1 \end{aligned} \qquad (6)$$

where $\|\cdot\|$ denotes the Euclidean norm. Now the set of solutions $v$ of (6) coincides with the set of
least eigenvectors of $W + D$. In view of (5), it is natural to consider the cut induced by such $v$.
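The link between the quadratic form in (5) and the weight of a cut can be checked by brute force on a small instance. The sketch below is an illustration with hypothetical names; it assumes a symmetric integer matrix $W$ with zero diagonal, represents the diagonal matrix $D$ by the list of its diagonal entries, and counts every cut edge once. Under these assumptions $v^T(W+D)v = \sum_{i,j} W_{ij} + \sum_i D_{ii} - 4 \cdot (\text{cut weight})$ for every $v \in \{1,-1\}^n$, so minimizing the quadratic form is the same as maximizing the cut.

```python
from itertools import product

def quadratic_form(W, D, v):
    """Evaluate v^T (W + diag(D)) v for a list-of-lists W and diagonal list D."""
    n = len(W)
    return (sum(v[i] * W[i][j] * v[j] for i in range(n) for j in range(n))
            + sum(D[i] * v[i] ** 2 for i in range(n)))

def sign_cut_weight(W, v):
    """Weight of the cut separating coordinates with v_i = 1 from v_i = -1."""
    n = len(W)
    return sum(W[i][j] for i in range(n) for j in range(i + 1, n)
               if v[i] != v[j])

def identity_holds(W, D):
    """Check the identity for every +/-1 vector (exponential; tiny n only)."""
    total = sum(map(sum, W))
    return all(
        quadratic_form(W, D, v) == total + sum(D) - 4 * sign_cut_weight(W, v)
        for v in product((1, -1), repeat=len(W)))
```

Since the diagonal term contributes the same amount for every $\pm 1$ vector, the choice of $D$ does not affect which cut is optimal for (5); it only matters once the constraint is relaxed as in (6).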



The GW- Algorithm 

There is another relaxation to (5), that seems unrelated to (6). It was suggested by [( iV\ ] and will 
play a major role in the sequel. In problem (5) we seek n vectors fi, . . . ,f„ in the dimensional 
sphere = {—1, 1} to minimize Y^i j Wij{vi, vj). Interesting relaxations are obtained by replacing 
5^^ with S"" for some m. As observed by [ ] for m = n — 1, the relaxation 

minimize Wi j { Vi , Vj ) 

i.o ' (7) 
subject to Vi e 

is feasible. In the ideal case, the solution $v_1, \ldots, v_n$ of (7) is contained in a copy of $S^0$, embedded in
$S^{n-1}$, which makes it a solution for (5) (in its new formulation). Thus, in the ideal case, separated
vectors correspond to two antipodal points, and all vertices that are on the same side of the cut
get mapped to the same point. Even if this ideal picture does not hold, one may expect that the
angle between separated vertices be large. Therefore, to extract a cut from $v_1, \ldots, v_n$ we need a
method that tends to (combinatorially) separate vertices whose images on the sphere are far apart.
In [GW] this is done by returning the cut induced by the vector $u \in \mathbb{R}^n$ defined by $u_i = \langle v, v_i \rangle$
where $v \in S^{n-1}$ is sampled uniformly. This yields the approximation ratio 0.879.
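The rounding step can be illustrated with a short sketch. It assumes the unit vectors $v_1, \ldots, v_n$ are already available (solving the relaxation itself is not shown), and it samples a standard Gaussian vector in place of a uniform point on the sphere, which gives the same signs since only the direction matters. The function name and the antipodal test configuration are hypothetical.

```python
import random

def hyperplane_rounding(vectors, rng):
    """Return S = {i : <g, v_i> > 0} for a random direction g.
    A standard Gaussian vector has a uniformly distributed direction, and
    rescaling g does not change the signs of the inner products."""
    dim = len(vectors[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return {i for i, v in enumerate(vectors)
            if sum(a * b for a, b in zip(g, v)) > 0}

# Hypothetical "ideal case": vertices 0 and 2 sit at a point p of the sphere,
# vertices 1 and 3 at the antipodal point -p.
p = [0.6, 0.8, 0.0, 0.0]
q = [-x for x in p]
vectors = [p, q, p, q]
cuts = [hyperplane_rounding(vectors, random.Random(seed)) for seed in range(20)]
```

In this configuration every random direction separates the two antipodal points, so each run returns one of the two sides of the same cut.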
To solve (7), the GW algorithm first finds a solution $P$ to the problem

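$$
\begin{aligned}
\text{minimize} \quad & P \circ W \\
\text{subject to} \quad & P \succeq 0 \\
& P_{ii} = 1, \ \forall i \in [n]
\end{aligned}
\tag{8}
$$

where $P \circ W := \sum_{1 \le i, j \le n} P_{ij} \cdot W_{ij}$. Since $P \succeq 0$, it is then possible to find vectors $v_1, \ldots, v_n$ such
that $P_{ij} = \langle v_i, v_j \rangle$. The dual to (8) is (see [GW])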

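$$
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} D_{ii} \\
\text{subject to} \quad & W - D \succeq 0 \\
& D \text{ is diagonal}
\end{aligned}
\tag{9}
$$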

As observed in [GW], by SDP duality the optima of (8) and (9) coincide. Denote by $\mathcal{P}(W)$
and $\mathcal{D}(W)$ the sets of optimal solutions to (8) and (9) respectively. Denote also $\mathcal{P} = \{P \in M_n(\mathbb{R}) :
P \succeq 0 \text{ and } \forall i, P_{ii} = 1\}$, $\mathcal{D} = \{D \in M_n(\mathbb{R}) : D \text{ is diagonal}\}$. We say that $W$ is GW-bipolar if there
exists a solution to (7) that also solves the binary problem (5) (i.e., it is contained in a copy of
$S^0$ embedded in $S^{n-1}$). Equivalently, $W$ is GW-bipolar if $\mathcal{P}(W)$ contains a matrix of the form $v \cdot v^T$
for some $v \in \{-1, 1\}^n$. Finally, we shall say that $W$ is strongly GW-bipolar if every solution to (7)
is also a solution of (5). Our interest in strongly GW-bipolar instances is clear. The maximal cut
of such an instance can be immediately read off the output of the GW-algorithm.

An overview. We start by asking which instances of MAXCUT can be solved exactly by a
spectral algorithm. As we show, the maximal cut is induced by a ±1 GLEV iff the instance is
GW-bipolar. More generally, an instance can be correctly solved by some spectral algorithm iff it
has a certain perturbation that is GW-bipolar. This provides additional motivation for the study
of GW-bipolar instances.

We give a primal-dual characterization of the set of solutions to the GW-relaxation. Specifically,
we show that the dual GW problem (9) always has a unique solution $D$ and the solutions of the
primal problem are $\mathcal{P}(W) = \{P \in \mathcal{P} : P \cdot (W - D) = 0\}$. This allows us to conclude that the
GW-algorithm is a spectral algorithm according to our definition. We also show that GW-bipolarity is
equivalent to a condition from [BL], under which MAXCUT can be solved exactly in polynomial
time.



A.1 Cuts induced by GLEV's

Let $w : V \times V \to \mathbb{R}$ be an instance with an associated matrix $W$. We seek conditions under
which a given cut $S$ is induced by a GLEV. Let $v \in \mathbb{R}^n$ be a vector that induces the cut $S$. As noted
before, $v$ is a GLEV if and only if $v$ is in the kernel of $W + D$ for some diagonal matrix $D$ for which
$W + D \succeq 0$. Thus, $v$ is a GLEV of $W$ if and only if the optimum of the following SDP is 0.

$$
\begin{aligned}
\text{minimize} \quad & v^T (W + D) v \\
\text{subject to} \quad & W + D \succeq 0 \\
& D \text{ is diagonal}
\end{aligned}
\tag{10}
$$



The dual program of (10) is 



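$$
\begin{aligned}
\underset{P}{\text{maximize}} \quad & v^T W v - P \circ W \\
\text{subject to} \quad & P_{ii} = v_i^2 \\
& P \succeq 0
\end{aligned}
\tag{11}
$$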
Since (10) has a positive definite solution, strong duality holds. Thus, $v$ is a GLEV iff the optimum
of (11) is 0.

Now, the optimum of the dual is 0 iff the perturbation of $W$ defined by $W'_{ij} = |v_i| \cdot |v_j| \cdot W_{ij}$
is GW-bipolar. To see that, note that the mapping $P' \mapsto P$ where $P_{ij} = |v_i| \cdot |v_j| \cdot P'_{ij}$ maps
the feasible solutions to the primal GW-relaxation (8) for $W'$ onto the feasible solutions to (11).
Moreover, $P \circ W = P' \circ W'$. Thus, the optimum of (11) is zero iff the optimum of the primal GW
relaxation of $W'$ is $v^T W v = \delta_S^T W' \delta_S$. Consequently, the optimum of (11) is 0 iff the optimum of (8)
for $W'$ is attained by a ±1 vector, making $W'$ GW-bipolar. Note that if $v$ "strongly induces" the cut $S$,
that is, if all coordinates $|v_i|$ are roughly equal, then $W'$ is just a small perturbation of $W$. Taking
this to the extreme, we conclude that the cut is induced by a ±1 GLEV iff $W$ is GW-bipolar.

A.2 The GW algorithm and GW-bipolar instances

We start with a primal-dual characterization of $\mathcal{P}(W)$ and $\mathcal{D}(W)$.

Theorem A.1 Let $W$ be a non-negative symmetric matrix with 0-diagonal. Then,

1. $\mathcal{D}(W)$ is a singleton¹.

2. $\mathcal{P}(W) = \{P \in \mathcal{P} : P(W - \mathcal{D}(W)) = 0\}$

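Lemma A.2 For every $D^0 \in \mathcal{D}(W)$, $P^0 \in \mathcal{P}(W)$ we have

$$\mathcal{P}(W) = \{P \in \mathcal{P} : P(W - D^0) = 0\}$$
$$\mathcal{D}(W) = \{D \in \mathcal{D} : (W - D) \succeq 0, \ P^0 (W - D) = 0\}$$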
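Proof Let $D^0 \in \mathcal{D}(W)$, $P \in \mathcal{P}$. By strong duality,

$$P \in \mathcal{P}(W) \iff W \circ P = \sum_{i=1}^{n} D^0_{ii} \iff W \circ P = D^0 \circ P$$

Since $W - D^0$ and $P$ are PSDs, $P \circ (W - D^0) = 0 \iff P(W - D^0) = 0$. Thus,

$$\mathcal{P}(W) = \{P \in \mathcal{P} : P(W - D^0) = 0\}$$

Similarly, let $P^0 \in \mathcal{P}(W)$, $D \in \mathcal{D}$ such that $W - D \succeq 0$; then

$$D \in \mathcal{D}(W) \iff W \circ P^0 = \sum_{i=1}^{n} D_{ii} \iff W \circ P^0 = D \circ P^0$$

Thus

$$\mathcal{D}(W) = \{D \in \mathcal{D} : (W - D) \succeq 0, \ P^0 (W - D) = 0\}$$

□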

Proof (of Theorem A.1) Part 2 follows from part 1 and Lemma A.2, so it only remains to prove
part 1. Fix some $P^0 \in \mathcal{P}(W)$ and let $D \in \mathcal{D}(W)$. By considering the $(j, j)$ entry of $P^0 (W - D) = 0$,
and using $P^0_{jj} = 1$, we have

$$D_{jj} = \sum_{i=1}^{n} P^0_{ji} W_{ij}$$

which determines $D$ uniquely.

¹Henceforth we usually do not distinguish between $\mathcal{D}(W)$ and the single matrix that it contains.

□



Corollary A.3 GW is a spectral algorithm.

Proof Suppose that the optimum of the GW-relaxation is attained at $P$ and let $v_1, \ldots, v_n \in S^{n-1}$
be vectors such that $P_{ij} = \langle v_i, v_j \rangle$. Let $v \in S^{n-1}$ be the vector sampled by the algorithm and let
$\sum_{j=1}^{n} \alpha_j v_j$ be its orthogonal projection on $\mathrm{span}\{v_1, \ldots, v_n\}$. The cut returned by the algorithm is
the one induced by the vector $u_i = \langle v, v_i \rangle = \sum_j \alpha_j P_{ij}$. The vector $u$ is a linear combination of $P$'s
columns. Thus, by Theorem A.1 it is in the kernel of the PSD matrix $W - \mathcal{D}(W)$.

□ 



Corollary A. 4 The GW algorithm correctly solves Q.{rv') -stable instances. 
Proof In [BL] it is shown that if $u$ is a GLEV of a $\gamma$-stable instance $W$ such that
$\gamma > \frac{\max_{(i,j) \in E} |u_i u_j|}{\min_{(i,j) \in E} |u_i u_j|}$

then u induces the optimal cut. Let u be defined as in the proof of Corollary A. 3. As shown, u is 
a GLEV. Moreover, by an easy probabilistic argument, w.h.p., 

□ 



Here is a characterization of GW-bipolar matrices. 

Theorem A.5 Let $W$ be an instance for MAXCUT with maximal cut $S$. Denote $v = \delta_S$ and let
$D$ be the diagonal matrix defined by $D_{ii} = -v_i \sum_j W_{ij} v_j$. The following conditions are equivalent.

1. $W$ is GW-bipolar.

2. $\delta_S$ is a GLEV of $W$.

3. $W + D \succeq 0$.

4. The optimum of the dual of the GW-relaxation is attained at $-D$.



Proof As shown in section A.1, condition 1 is equivalent to condition 2. Suppose now that 3 holds.
It is not hard to see that $\delta_S$ is in the kernel of $W + D$, so 2 holds. Condition 4 clearly entails
condition 3. Finally, suppose that 1 holds. Let $D'$ be the solution of problem (9). Since $W$ is
GW-bipolar, $\delta_S \cdot \delta_S^T$ is an optimal primal solution. By Lemma A.2 we deduce that $\delta_S \in \ker(W - D')$.
It follows that $D' = -D$ and 4 holds.

□ 
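For illustration, here is a small worked instance of these conditions. The choice of a single edge of weight 1 is ours and does not come from the text; the computation simply follows the definitions of $D$, (8) and (9) above.

```latex
\begin{align*}
W &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
v = \delta_S = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \\
D_{11} &= -v_1 \, (W_{12} v_2) = 1, \qquad
D_{22} = -v_2 \, (W_{21} v_1) = 1, \\
W + D &= \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \succeq 0, \qquad
(W + D)\, \delta_S = 0, \\
P &= \delta_S \cdot \delta_S^T = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
P \circ W = -2 = \sum_{i} (-D)_{ii}.
\end{align*}
```

Here condition 3 holds, $\delta_S$ lies in the kernel of $W + D$, and the primal value at $\delta_S \cdot \delta_S^T$ matches the dual value at $-D$, as conditions 1, 2 and 4 require.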

As noted before, strongly GW-bipolar instances can be efficiently solved using the GW algorithm.
In fact, for those instances there is no need to choose a random vector to produce a cut. Moreover,
those instances can be solved simply by taking the sign pattern of the least eigenvector of $W - D$
where $D$ is the solution to problem (9). As we explain next, strong GW-bipolarity is just slightly
stronger than GW-bipolarity. Let $W$ be a GW-bipolar instance with maximal cut $(S, \bar{S})$. Let $W'$
be the $(1 + \varepsilon)$-perturbation of $W$ that is obtained by multiplying cut edges by $1 + \varepsilon$ with $\varepsilon > 0$
arbitrarily small. We claim that it is strongly GW-bipolar. Let $D$ be the diagonal matrix defined
in Theorem A.5. We have $W + D \succeq 0$ if and only if for every $u \in S^{n-1}$

$$u^T (W + D) u = \sum_{ij \in E(S, \bar{S})} W_{ij} (u_i + u_j)^2 - \sum_{ij \notin E(S, \bar{S})} W_{ij} (u_i - u_j)^2 \ge 0 \tag{12}$$
Inequality (12) clearly holds for $W'$ as well, making it GW-bipolar. Moreover, since the maximal
cut is connected, if $u \notin \mathbb{R} \cdot \delta_S$ then $\sum_{ij \in E(S, \bar{S})} W'_{ij} (u_i + u_j)^2 > \sum_{ij \in E(S, \bar{S})} W_{ij} (u_i + u_j)^2$. Thus,
$u^T (W' + D') u > u^T (W + D) u \ge 0$ where $D'$ is the matrix corresponding to $W'$ from Theorem A.5.
Thus, the matrix $W' + D'$ has rank $n - 1$. By Theorem A.1 we conclude that $\delta_S \cdot \delta_S^T$ is the only
solution to the primal GW-problem for $W'$, making $W'$ strongly GW-bipolar.



B A randomized algorithm for $\varepsilon \cdot \frac{n}{\log(n)}$-stable instances

We now describe a simple randomized algorithm that correctly solves $\varepsilon \cdot \frac{n}{\log(n)}$-stable instances of
MAXCUT. So let $w : V \times V \to [0, \infty)$ be a $\gamma$-stable instance with $\gamma = \varepsilon \cdot \frac{n}{\log(n)}$. Our algorithm
proceeds as follows.

1. Set $V_0 = \{v_0\}$ for some $v_0 \in V$ and set $E_0 = \emptyset$.

2. For $t = 1$ to $|V| - 1$

• Sample a random edge $v_t u_t \in E(V_{t-1}, \bar{V}_{t-1})$, where the probability of every edge is
proportional to its weight.

• Set $V_t = V_{t-1} \cup \{v_t, u_t\}$, $E_t = E_{t-1} \cup \{v_t u_t\}$

3. Note that $(V_t, E_t)$ is a tree for every $t$ and for $t = |V| - 1$ this is a spanning tree. Return the
bipartition corresponding to the two-coloring of this tree.
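The three steps above can be sketched in a few lines. This is a minimal illustration rather than a tuned implementation: the function name, the matrix representation of $w$, and the example instance are assumptions, and the graph is assumed to be connected so that a boundary edge always exists.

```python
import random

def random_tree_cut(W, rng, start=0):
    """Grow a random spanning tree as in steps 1-3 and return the side of the
    resulting bipartition that contains the start vertex. W is a symmetric
    matrix of non-negative weights; the graph is assumed to be connected."""
    n = len(W)
    color = {start: 0}                      # V_0 = {v_0}
    for _ in range(n - 1):
        # Edges leaving the current tree, i.e. E(V_{t-1}, complement of V_{t-1}).
        boundary = [(i, j) for i in color for j in range(n)
                    if j not in color and W[i][j] > 0]
        weights = [W[i][j] for i, j in boundary]
        i, j = rng.choices(boundary, weights=weights, k=1)[0]
        color[j] = 1 - color[i]             # tree edge: endpoints get different colors
    return {v for v, c in color.items() if c == 0}

# Hypothetical instance: a weighted complete bipartite graph on {0, 1} and {2, 3, 4}.
W = [[0, 0, 1, 2, 1],
     [0, 0, 3, 1, 1],
     [1, 3, 0, 0, 0],
     [2, 1, 0, 0, 0],
     [1, 1, 0, 0, 0]]
side = random_tree_cut(W, random.Random(0))
```

On a bipartite instance every edge lies in the maximal cut, so every sampled tree yields the same bipartition; on general instances the outcome depends on the sampled edges.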

Analysis: In order to return the maximal cut, it is sufficient (in fact, also necessary) that for
every $t$, the edge $v_t u_t$ be in the maximal cut. But, by observation 1, the edges in $E(V_t, \bar{V}_t)$ that are
in the maximal cut constitute a $\ge \frac{\gamma}{\gamma + 1}$ fraction of all the edges in $E(V_t, \bar{V}_t)$. Thus a lower bound on
the success probability of the algorithm can be derived as follows:



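$$
\left( \frac{\gamma}{\gamma + 1} \right)^{n-1}
\ge \left( 1 - \frac{1}{\gamma + 1} \right)^{n}
= \left( \left( 1 - \frac{1}{\varepsilon \frac{n}{\log(n)} + 1} \right)^{\frac{n}{\log(n)}} \right)^{\log(n)}
= \left( e^{-\frac{1}{\varepsilon}} + o(1) \right)^{\log(n)}
= n^{\ln\left( e^{-\frac{1}{\varepsilon}} + o(1) \right)}
= n^{-\frac{1}{\varepsilon} + o(1)}
$$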



In particular, for $\varepsilon$ fixed the process succeeds with probability that is at least inverse polynomial
in $n$.
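To get a feel for the size of this bound, the following sketch evaluates $(\gamma / (\gamma + 1))^{n-1}$ numerically and compares it with $n^{-1/\varepsilon}$. It assumes $\gamma = \varepsilon \cdot n / \ln(n)$ with the natural logarithm; the function name and the sample values of $n$ and $\varepsilon$ are illustrative choices.

```python
import math

def success_lower_bound(n, eps):
    """Lower bound (gamma / (gamma + 1))^(n - 1) on the success probability,
    for gamma = eps * n / ln(n) (natural logarithm assumed)."""
    gamma = eps * n / math.log(n)
    return (gamma / (gamma + 1.0)) ** (n - 1)

# Compare the bound with n^(-1/eps) for a few sizes, with eps = 1.
for n in (100, 1000, 10000):
    print(n, success_lower_bound(n, 1.0), n ** -1.0)
```

Since the bound is only inverse polynomial, one would repeat the algorithm polynomially many times and keep the heaviest cut found.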